Multi-Scale Correlations in Continuous Genomic Data

Authors

  • Robert E. Thurman
  • William Stafford Noble
  • John A. Stamatoyannopoulos
Abstract

Functional genomic quantities such as histone modifications, chromatin accessibility, and evolutionary constraint can now be measured in a nearly continuous fashion across the genome. The genome is highly heterogeneous, and the relationships between different functional annotations may be fluid. Here we present an approach for visualizing, quantifying, and determining the statistical significance of local and regional correlations between high-density continuous genomic datasets. We use wavelets to generate a multi-scale view of each component data set and calculate correlations between data types as a function of genome position over a continuous range of scales in a sliding-window fashion. We determine the statistical significance of correlations using a non-parametric sampling approach. We apply the wavelet correlation method to histone modification and chromatin accessibility (DNaseI sensitivity) data from the NHGRI ENCODE project. We show that DNaseI sensitivity is broadly correlated (though to differing degrees) with a number of different activating histone modifications. We examine the continuous relationship between the repressive histone modification H3K27me3 and the activating mark H3K4me2, and find these modifications to display significant duality, with both significantly positively and significantly negatively correlated genomic territories. While the former appear to recapitulate in definitive cells the so-called "bi-valent" pattern originally proposed as a signature of pluripotency, the presence of negatively correlated regions suggests that the regulatory events that underlie the observed modification patterns are complex and highly regionalized in the genome.

Similar resources

MODMatcher: Multi-Omics Data Matcher for Integrative Genomic Analysis

Errors in sample annotation or labeling often occur in large-scale genetic or genomic studies and are difficult to avoid completely during data generation and management. For integrative genomic studies, it is critical to identify and correct these errors. Different types of genetic and genomic data are inter-connected by cis-regulations. On that basis, we developed a computational approach, Mu...

Suggestion of New Correlations for Drop/Interface Coalescence Phenomena in the Absence and Presence of Single Surfactant

After designing and constructing a coalescence cell, the drop/interface coalescence phenomenon was studied in the absence and presence of a single surfactant. Two surface-active agents, sodium dodecyl sulfate and 1-decanol, were used. Distilled water was used as the dispersed phase. Toluene, n-heptane, and 60% (v/v) aqueous glycerol were selected, separately, as continuous phases. It was found that ...

Shape-based alignment of genomic landscapes in multi-scale resolution

Due to dramatic advances in DNA technology, quantitative measures of annotation data can now be obtained in continuous coordinates across the entire genome, allowing various heterogeneous 'genomic landscapes' to emerge. Although much effort has been devoted to comparing DNA sequences, not much attention has been given to comparing these large quantities of data comprehensively. In this article,...

Discovery of multi-dimensional modules by integrative analysis of cancer genomic data

Recent technology has made it possible to simultaneously perform multi-platform genomic profiling (e.g. DNA methylation (DM) and gene expression (GE)) of biological samples, resulting in so-called 'multi-dimensional genomic data'. Such data provide unique opportunities to study the coordination between regulatory mechanisms on multiple levels. However, integrative analysis of multi-dimensional ...

Multi-population Genomic Relationships for Estimating Current Genetic Variances Within and Genetic Correlations Between Populations.

Different methods are available to calculate multi-population genomic relationship matrices. Since those matrices differ in base population, it is anticipated that the method used to calculate genomic relationships affects the estimate of genetic variances, covariances, and correlations. The aim of this article is to define the multi-population genomic relationship matrix to estimate current ge...


Journal:
  • Pacific Symposium on Biocomputing

Publication date: 2008